OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models

Internal identifier: 001565 (Main/Exploration); previous: 001564; next: 001566

Authors: Takashi Okada [Japan]; Atsuhiro Takasu [Japan]; Jun Adachi [Japan]

RBID : ISTEX:DF534A7FD00E3F6BCB2FC98D05CB828CE9813463

Abstract

Article citations are composed of subfields such as author, title, journal, and year. It is useful to identify the attributes of these subfields automatically, since they are used for linking a citation with the actual cited article. In this article, we employ a Support Vector Machine (SVM), a method of machine learning, to automatically identify subfields. We then employ a Hidden Markov Model (HMM) to improve the identification accuracy. Information from the subfields identified by the SVM, and syntactic information analyzed by the HMM, are integrated to make an accurate identification.
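
As a rough illustration of how per-token classifier scores and sequence-level syntax can be combined, the sketch below runs Viterbi decoding over toy data. It is not the authors' implementation: the label order, the scores standing in for SVM outputs (SCORES), and the transition and initial probabilities (TRANSITION, INITIAL) are all invented for this example.

```python
import math

# Label set taken from the subfields named in the abstract; the left-to-right
# order is an assumption made for this toy example.
LABELS = ["author", "title", "journal", "year"]


def _log(p):
    """Logarithm that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(emission_scores, transition, initial):
    """Return the most probable label sequence for a list of tokens.

    emission_scores: one dict per token, label -> probability
                     (a stand-in for calibrated classifier outputs).
    transition:      dict (prev_label, label) -> probability.
    initial:         dict label -> probability of starting in that label.
    """
    if not emission_scores:
        return []
    # Work in log space to avoid underflow on long sequences.
    best = [{lab: _log(initial[lab]) + _log(emission_scores[0][lab])
             for lab in LABELS}]
    back = []
    for scores in emission_scores[1:]:
        column, pointers = {}, {}
        for lab in LABELS:
            prev = max(LABELS,
                       key=lambda p: best[-1][p] + _log(transition[(p, lab)]))
            column[lab] = (best[-1][prev] + _log(transition[(prev, lab)])
                           + _log(scores[lab]))
            pointers[lab] = prev
        best.append(column)
        back.append(pointers)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy left-to-right model: a label may repeat or advance to the next one.
TRANSITION = {}
for i, prev_label in enumerate(LABELS):
    for j, label in enumerate(LABELS):
        if j == i:
            TRANSITION[(prev_label, label)] = 0.6 if i < len(LABELS) - 1 else 1.0
        elif j == i + 1:
            TRANSITION[(prev_label, label)] = 0.4
        else:
            TRANSITION[(prev_label, label)] = 0.0

INITIAL = {"author": 0.7, "title": 0.1, "journal": 0.1, "year": 0.1}

# Invented scores for four tokens; the third token looks like an author
# when classified in isolation.
SCORES = [
    {"author": 0.80, "title": 0.10, "journal": 0.05, "year": 0.05},
    {"author": 0.20, "title": 0.70, "journal": 0.05, "year": 0.05},
    {"author": 0.45, "title": 0.10, "journal": 0.40, "year": 0.05},
    {"author": 0.03, "title": 0.03, "journal": 0.04, "year": 0.90},
]

per_token = [max(s, key=s.get) for s in SCORES]
decoded = viterbi(SCORES, TRANSITION, INITIAL)
print(per_token)  # ['author', 'title', 'author', 'year']
print(decoded)    # ['author', 'title', 'journal', 'year']
```

In this toy setting the third token is labeled "author" when scored alone, but the sequence model rules out a title-to-author transition and decodes it as "journal".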

DOI: 10.1007/978-3-540-30230-8_46


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models</title>
<author>
<name sortKey="Okada, Takashi" sort="Okada, Takashi" uniqKey="Okada T" first="Takashi" last="Okada">Takashi Okada</name>
</author>
<author>
<name sortKey="Takasu, Atsuhiro" sort="Takasu, Atsuhiro" uniqKey="Takasu A" first="Atsuhiro" last="Takasu">Atsuhiro Takasu</name>
</author>
<author>
<name sortKey="Adachi, Jun" sort="Adachi, Jun" uniqKey="Adachi J" first="Jun" last="Adachi">Jun Adachi</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:DF534A7FD00E3F6BCB2FC98D05CB828CE9813463</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-30230-8_46</idno>
<idno type="url">https://api.istex.fr/document/DF534A7FD00E3F6BCB2FC98D05CB828CE9813463/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000639</idno>
<idno type="wicri:Area/Istex/Curation">000631</idno>
<idno type="wicri:Area/Istex/Checkpoint">000E00</idno>
<idno type="wicri:doubleKey">0302-9743:2004:Okada T:bibliographic:component:extraction</idno>
<idno type="wicri:Area/Main/Merge">001616</idno>
<idno type="wicri:Area/Main/Curation">001565</idno>
<idno type="wicri:Area/Main/Exploration">001565</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models</title>
<author>
<name sortKey="Okada, Takashi" sort="Okada, Takashi" uniqKey="Okada T" first="Takashi" last="Okada">Takashi Okada</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Japon</country>
<wicri:regionArea>Information Science and Technology, Information and Communication Engineering, The University of Tokyo, 7-3-1 Bunkyo-ku, Tokyo</wicri:regionArea>
<placeName>
<settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Japon</country>
</affiliation>
</author>
<author>
<name sortKey="Takasu, Atsuhiro" sort="Takasu, Atsuhiro" uniqKey="Takasu A" first="Atsuhiro" last="Takasu">Atsuhiro Takasu</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Japon</country>
<wicri:regionArea>National Institute of Informatics, 2-1-2, Hitotsubashi, Chiyoda-ku, Tokyo</wicri:regionArea>
<placeName>
<settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Japon</country>
</affiliation>
</author>
<author>
<name sortKey="Adachi, Jun" sort="Adachi, Jun" uniqKey="Adachi J" first="Jun" last="Adachi">Jun Adachi</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Japon</country>
<wicri:regionArea>National Institute of Informatics, 2-1-2, Hitotsubashi, Chiyoda-ku, Tokyo</wicri:regionArea>
<placeName>
<settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Japon</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2004</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">DF534A7FD00E3F6BCB2FC98D05CB828CE9813463</idno>
<idno type="DOI">10.1007/978-3-540-30230-8_46</idno>
<idno type="ChapterID">46</idno>
<idno type="ChapterID">Chap46</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Article citations are composed of subfields such as author, title, journal, and year. It is useful to automatically identify attributes of these subfields, since they are used for linking a citation with the actual cited article. In this article, we employ a Support Vector Machine (SVM), a method of machine learning, to automatically identify subfields. We then employ a Hidden Markov Model (HMM) to improve the identification accuracy. Information from the subfields identified by the SVM, and syntactic information analyzed by the HMM, are integrated to make an accurate identification.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Japon</li>
</country>
<settlement>
<li>Tokyo</li>
</settlement>
</list>
<tree>
<country name="Japon">
<noRegion>
<name sortKey="Okada, Takashi" sort="Okada, Takashi" uniqKey="Okada T" first="Takashi" last="Okada">Takashi Okada</name>
</noRegion>
<name sortKey="Adachi, Jun" sort="Adachi, Jun" uniqKey="Adachi J" first="Jun" last="Adachi">Jun Adachi</name>
<name sortKey="Adachi, Jun" sort="Adachi, Jun" uniqKey="Adachi J" first="Jun" last="Adachi">Jun Adachi</name>
<name sortKey="Okada, Takashi" sort="Okada, Takashi" uniqKey="Okada T" first="Takashi" last="Okada">Takashi Okada</name>
<name sortKey="Takasu, Atsuhiro" sort="Takasu, Atsuhiro" uniqKey="Takasu A" first="Atsuhiro" last="Takasu">Atsuhiro Takasu</name>
<name sortKey="Takasu, Atsuhiro" sort="Takasu, Atsuhiro" uniqKey="Takasu A" first="Atsuhiro" last="Takasu">Atsuhiro Takasu</name>
</country>
</tree>
</affiliations>
</record>
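
A minimal sketch of reading fields from a record like the one above with Python's standard library. The record uses the wicri: prefix without declaring it, so a strict parser rejects it as written; here the text is wrapped in a hypothetical element that binds the prefix to a placeholder namespace URI. The wrapper element, the placeholder URI, the function name parse_record, and the trimmed RECORD excerpt are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption; the real Wicri URI is not given).
PLACEHOLDER_NS = "urn:example:wicri"

# Trimmed excerpt of the record, with most attributes and elements omitted.
RECORD = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models</title>
<author>
<name sortKey="Okada, Takashi" first="Takashi" last="Okada">Takashi Okada</name>
</author>
<author>
<name sortKey="Takasu, Atsuhiro" first="Atsuhiro" last="Takasu">Atsuhiro Takasu</name>
</author>
<author>
<name sortKey="Adachi, Jun" first="Jun" last="Adachi">Jun Adachi</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-30230-8_46</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def parse_record(text):
    """Extract title, authors, year, and DOI from a record string."""
    # Wrap the record so that the undeclared wicri: prefix is bound.
    wrapped = f'<wrapper xmlns:wicri="{PLACEHOLDER_NS}">{text}</wrapper>'
    record = ET.fromstring(wrapped).find("record")
    return {
        "title": record.findtext(".//titleStmt/title"),
        "authors": [n.text for n in record.findall(".//titleStmt/author/name")],
        "year": record.findtext(".//publicationStmt/date"),
        "doi": record.findtext(".//publicationStmt/idno[@type='doi']"),
    }


fields = parse_record(RECORD)
print(fields["authors"])  # ['Takashi Okada', 'Atsuhiro Takasu', 'Jun Adachi']
print(fields["doi"])      # 10.1007/978-3-540-30230-8_46
```

Reading authors from titleStmt rather than from the affiliation tree avoids the repeated name entries that the tree contains.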

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001565 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001565 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:DF534A7FD00E3F6BCB2FC98D05CB828CE9813463
   |texte=   Bibliographic Component Extraction Using Support Vector Machines and Hidden Markov Models
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024